fix(llmobs): openai-java payload mapping for responses, tool metadata, and prompt tracking by ygree · Pull Request #10644 · DataDog/dd-trace-java

ygree · 2026-02-19T21:45:14Z

What Does This Do

Aligns OpenAI Java LLMObs span payloads with expected intake/system-test schema by:

- Adding/filling missing LLMObs tags:
  - _ml_obs_tag.integration
  - _ml_obs_tag.source
  - _ml_obs_tag.ddtrace.version
  - _ml_obs_tag.error
  - _ml_obs_tag.error_type
- Ensuring model_name (and stable placeholder output where applicable) is set on error paths for chat/completions/embeddings/responses.
- Expanding Responses instrumentation:
  - prompt tracking (input.prompt, variables, chat_template)
  - tool definition extraction (tool_definitions)
  - tool call/result extraction across function/custom/MCP outputs
  - metadata normalization (stream, tool_choice, text.verbosity, etc.)
- Refactoring JSON conversion via shared JsonValueUtils.
- Updating LLMObs mapper payload shape:
  - writes _dd map with span/trace ids
  - nests error fields under meta.error
  - supports map-based LLM input serialization (messages + prompt)
  - remaps tool_definitions into meta.
- Updating tests to add value-level assertions for the above behavior.
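
To make the mapper payload shape described above easier to picture, here is a minimal sketch that builds such an event as nested maps. It is not the code from this PR: the class `LlmObsEventSketch`, the method `buildEvent`, and the inner key names (`span_id`/`trace_id` inside `_dd`, `type`/`message` inside `meta.error`, and `model_name` under `meta`) are assumptions chosen for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the payload shape; not the dd-trace-java mapper.
public class LlmObsEventSketch {

    static Map<String, Object> buildEvent(
            String spanId,
            String traceId,
            String modelName,
            String errorType,      // null when the span did not fail
            String errorMessage,
            List<Map<String, Object>> toolDefinitions) {

        // "_dd" map carrying the span/trace ids (inner key names are assumed).
        Map<String, Object> dd = new LinkedHashMap<>();
        dd.put("span_id", spanId);
        dd.put("trace_id", traceId);

        Map<String, Object> meta = new LinkedHashMap<>();
        // model_name is set even on error paths.
        meta.put("model_name", modelName);

        // Error fields are nested under meta.error instead of the event root.
        if (errorType != null) {
            Map<String, Object> error = new LinkedHashMap<>();
            error.put("type", errorType);
            error.put("message", errorMessage);
            meta.put("error", error);
        }

        // tool_definitions are remapped into meta when present.
        if (toolDefinitions != null && !toolDefinitions.isEmpty()) {
            meta.put("tool_definitions", toolDefinitions);
        }

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("_dd", dd);
        event.put("meta", meta);
        return event;
    }
}
```

In this sketch a failed span yields an event with no top-level `error` key; the error details only appear under `meta`.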

Motivation

OpenAI/LLMObs system tests exposed schema and tag mismatches in Java payloads (especially response spans, tool metadata, error mapping, and prompt tracking structure). This change brings Java output in line with the expected LLMObs intake contract and behavior.

Additional Notes

openai-java-3.0 min version updated from 3.0.0 to 3.0.1.

DataDog/dd-apm-test-agent#280
DataDog/system-tests#6364

Contributor Checklist

- Format the title according to the contribution guidelines
- Assign the type: and (comp: or inst:) labels in addition to any other useful labels
- Avoid using close, fix, or any linking keywords when referencing an issue
  - Use solves instead, and assign the PR milestone to the issue
- Update the CODEOWNERS file on source file addition, migration, or deletion
- Update public documentation with any new configuration flags or behaviors

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

pr-commenter · 2026-02-19T22:33:17Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	ygree/llmobs-systest-fixes
git_commit_date	1772749357	1772798039
git_commit_sha	`4fd66d4`	`0c879ba`
release_version	1.61.0-SNAPSHOT~4fd66d45a9	1.60.0-SNAPSHOT~0c879ba692

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1772799692	1772799692
ci_job_id	1482775562	1482775562
ci_pipeline_id	100871637	100871637
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-slo2c1ym 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-slo2c1ym 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 61 metrics, 10 unstable metrics.

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.067 s) : 0, 1066582
Total [baseline] (11.121 s) : 0, 11121233
Agent [candidate] (1.066 s) : 0, 1065902
Total [candidate] (11.028 s) : 0, 11027967
section appsec
Agent [baseline] (1.248 s) : 0, 1248205
Total [baseline] (11.103 s) : 0, 11103052
Agent [candidate] (1.262 s) : 0, 1262238
Total [candidate] (11.226 s) : 0, 11226486
section iast
Agent [baseline] (1.227 s) : 0, 1227251
Total [baseline] (11.331 s) : 0, 11331082
Agent [candidate] (1.227 s) : 0, 1227053
Total [candidate] (11.259 s) : 0, 11259477
section profiling
Agent [baseline] (1.182 s) : 0, 1182109
Total [baseline] (11.025 s) : 0, 11024948
Agent [candidate] (1.182 s) : 0, 1181999
Total [candidate] (10.975 s) : 0, 10974843

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.067 s	-
Agent	appsec	1.248 s	181.623 ms (17.0%)
Agent	iast	1.227 s	160.668 ms (15.1%)
Agent	profiling	1.182 s	115.527 ms (10.8%)
Total	tracing	11.121 s	-
Total	appsec	11.103 s	-18.181 ms (-0.2%)
Total	iast	11.331 s	209.849 ms (1.9%)
Total	profiling	11.025 s	-96.285 ms (-0.9%)
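
As a worked check of how the "Δ tracing" column appears to be derived, the sketch below recomputes the baseline Agent/appsec row from the raw microsecond values in the petclinic startup chart (1066582 µs for tracing, 1248205 µs for appsec). The class and method names are hypothetical, and the formula (variant minus tracing, expressed relative to tracing) is inferred from the reported numbers rather than taken from the benchmark tooling.

```java
import java.util.Locale;

// Hypothetical helper that reproduces one row of the "Δ tracing" column.
public class StartupDelta {

    // Absolute overhead of a variant relative to the tracing variant, in microseconds.
    static long deltaMicros(long variantMicros, long tracingMicros) {
        return variantMicros - tracingMicros;
    }

    // Relative overhead, as a percentage of the tracing duration.
    static double deltaPercent(long variantMicros, long tracingMicros) {
        return 100.0 * (variantMicros - tracingMicros) / tracingMicros;
    }

    public static void main(String[] args) {
        long tracing = 1_066_582L; // Agent [baseline], tracing
        long appsec = 1_248_205L;  // Agent [baseline], appsec
        // Prints: 181.623 ms (17.0%)
        System.out.printf(Locale.ROOT, "%.3f ms (%.1f%%)%n",
                deltaMicros(appsec, tracing) / 1000.0,
                deltaPercent(appsec, tracing));
    }
}
```

The result matches the 181.623 ms (17.0%) reported for the baseline Agent/appsec row.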

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.066 s	-
Agent	appsec	1.262 s	196.336 ms (18.4%)
Agent	iast	1.227 s	161.151 ms (15.1%)
Agent	profiling	1.182 s	116.097 ms (10.9%)
Total	tracing	11.028 s	-
Total	appsec	11.226 s	198.519 ms (1.8%)
Total	iast	11.259 s	231.51 ms (2.1%)
Total	profiling	10.975 s	-53.124 ms (-0.5%)

gantt
    title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.203 ms) : 0, 1203
crashtracking [candidate] (1.197 ms) : 0, 1197
BytebuddyAgent [baseline] (632.864 ms) : 0, 632864
BytebuddyAgent [candidate] (633.286 ms) : 0, 633286
AgentMeter [baseline] (29.336 ms) : 0, 29336
AgentMeter [candidate] (29.342 ms) : 0, 29342
GlobalTracer [baseline] (258.493 ms) : 0, 258493
GlobalTracer [candidate] (258.707 ms) : 0, 258707
AppSec [baseline] (31.799 ms) : 0, 31799
AppSec [candidate] (31.701 ms) : 0, 31701
Debugger [baseline] (60.048 ms) : 0, 60048
Debugger [candidate] (59.693 ms) : 0, 59693
Remote Config [baseline] (596.948 µs) : 0, 597
Remote Config [candidate] (584.709 µs) : 0, 585
Telemetry [baseline] (8.783 ms) : 0, 8783
Telemetry [candidate] (8.656 ms) : 0, 8656
Flare Poller [baseline] (7.332 ms) : 0, 7332
Flare Poller [candidate] (6.602 ms) : 0, 6602
section appsec
crashtracking [baseline] (1.191 ms) : 0, 1191
crashtracking [candidate] (1.196 ms) : 0, 1196
BytebuddyAgent [baseline] (659.811 ms) : 0, 659811
BytebuddyAgent [candidate] (668.308 ms) : 0, 668308
AgentMeter [baseline] (12.017 ms) : 0, 12017
AgentMeter [candidate] (12.172 ms) : 0, 12172
GlobalTracer [baseline] (258.68 ms) : 0, 258680
GlobalTracer [candidate] (261.788 ms) : 0, 261788
AppSec [baseline] (177.448 ms) : 0, 177448
AppSec [candidate] (178.53 ms) : 0, 178530
Debugger [baseline] (64.807 ms) : 0, 64807
Debugger [candidate] (66.01 ms) : 0, 66010
Remote Config [baseline] (569.625 µs) : 0, 570
Remote Config [candidate] (586.717 µs) : 0, 587
Telemetry [baseline] (9.759 ms) : 0, 9759
Telemetry [candidate] (9.13 ms) : 0, 9130
Flare Poller [baseline] (3.625 ms) : 0, 3625
Flare Poller [candidate] (3.65 ms) : 0, 3650
IAST [baseline] (23.997 ms) : 0, 23997
IAST [candidate] (24.468 ms) : 0, 24468
section iast
crashtracking [baseline] (1.194 ms) : 0, 1194
crashtracking [candidate] (1.186 ms) : 0, 1186
BytebuddyAgent [baseline] (796.331 ms) : 0, 796331
BytebuddyAgent [candidate] (796.066 ms) : 0, 796066
AgentMeter [baseline] (11.298 ms) : 0, 11298
AgentMeter [candidate] (11.301 ms) : 0, 11301
GlobalTracer [baseline] (247.443 ms) : 0, 247443
GlobalTracer [candidate] (247.321 ms) : 0, 247321
AppSec [baseline] (27.287 ms) : 0, 27287
AppSec [candidate] (26.411 ms) : 0, 26411
Debugger [baseline] (62.383 ms) : 0, 62383
Debugger [candidate] (63.27 ms) : 0, 63270
Remote Config [baseline] (526.863 µs) : 0, 527
Remote Config [candidate] (525.138 µs) : 0, 525
Telemetry [baseline] (14.794 ms) : 0, 14794
Telemetry [candidate] (14.861 ms) : 0, 14861
Flare Poller [baseline] (4.898 ms) : 0, 4898
Flare Poller [candidate] (5.088 ms) : 0, 5088
IAST [baseline] (25.172 ms) : 0, 25172
IAST [candidate] (25.134 ms) : 0, 25134
section profiling
ProfilingAgent [baseline] (93.755 ms) : 0, 93755
ProfilingAgent [candidate] (93.731 ms) : 0, 93731
crashtracking [baseline] (1.16 ms) : 0, 1160
crashtracking [candidate] (1.175 ms) : 0, 1175
BytebuddyAgent [baseline] (683.354 ms) : 0, 683354
BytebuddyAgent [candidate] (683.254 ms) : 0, 683254
AgentMeter [baseline] (8.587 ms) : 0, 8587
AgentMeter [candidate] (8.566 ms) : 0, 8566
GlobalTracer [baseline] (215.345 ms) : 0, 215345
GlobalTracer [candidate] (215.619 ms) : 0, 215619
AppSec [baseline] (31.866 ms) : 0, 31866
AppSec [candidate] (31.828 ms) : 0, 31828
Debugger [baseline] (61.093 ms) : 0, 61093
Debugger [candidate] (64.008 ms) : 0, 64008
Remote Config [baseline] (592.758 µs) : 0, 593
Remote Config [candidate] (572.912 µs) : 0, 573
Telemetry [baseline] (12.099 ms) : 0, 12099
Telemetry [candidate] (8.952 ms) : 0, 8952
Flare Poller [baseline] (3.521 ms) : 0, 3521
Flare Poller [candidate] (3.463 ms) : 0, 3463
Profiling [baseline] (94.308 ms) : 0, 94308
Profiling [candidate] (94.289 ms) : 0, 94289

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.067 s) : 0, 1066778
Total [baseline] (8.864 s) : 0, 8863867
Agent [candidate] (1.059 s) : 0, 1058746
Total [candidate] (8.813 s) : 0, 8813442
section iast
Agent [baseline] (1.228 s) : 0, 1227825
Total [baseline] (9.587 s) : 0, 9587360
Agent [candidate] (1.229 s) : 0, 1228562
Total [candidate] (9.545 s) : 0, 9545414

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.067 s	-
Agent	iast	1.228 s	161.047 ms (15.1%)
Total	tracing	8.864 s	-
Total	iast	9.587 s	723.493 ms (8.2%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.059 s	-
Agent	iast	1.229 s	169.816 ms (16.0%)
Total	tracing	8.813 s	-
Total	iast	9.545 s	731.971 ms (8.3%)

gantt
    title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.209 ms) : 0, 1209
crashtracking [candidate] (1.19 ms) : 0, 1190
BytebuddyAgent [baseline] (634.471 ms) : 0, 634471
BytebuddyAgent [candidate] (628.269 ms) : 0, 628269
AgentMeter [baseline] (29.244 ms) : 0, 29244
AgentMeter [candidate] (29.124 ms) : 0, 29124
GlobalTracer [baseline] (258.819 ms) : 0, 258819
GlobalTracer [candidate] (256.764 ms) : 0, 256764
AppSec [baseline] (31.851 ms) : 0, 31851
AppSec [candidate] (31.515 ms) : 0, 31515
Debugger [baseline] (59.223 ms) : 0, 59223
Debugger [candidate] (58.58 ms) : 0, 58580
Remote Config [baseline] (605.179 µs) : 0, 605
Remote Config [candidate] (600.067 µs) : 0, 600
Telemetry [baseline] (8.718 ms) : 0, 8718
Telemetry [candidate] (8.666 ms) : 0, 8666
Flare Poller [baseline] (6.375 ms) : 0, 6375
Flare Poller [candidate] (7.921 ms) : 0, 7921
section iast
crashtracking [baseline] (1.19 ms) : 0, 1190
crashtracking [candidate] (1.204 ms) : 0, 1204
BytebuddyAgent [baseline] (796.813 ms) : 0, 796813
BytebuddyAgent [candidate] (797.644 ms) : 0, 797644
AgentMeter [baseline] (11.301 ms) : 0, 11301
AgentMeter [candidate] (11.352 ms) : 0, 11352
GlobalTracer [baseline] (247.268 ms) : 0, 247268
GlobalTracer [candidate] (247.674 ms) : 0, 247674
IAST [baseline] (25.243 ms) : 0, 25243
IAST [candidate] (25.148 ms) : 0, 25148
AppSec [baseline] (28.102 ms) : 0, 28102
AppSec [candidate] (26.453 ms) : 0, 26453
Debugger [baseline] (61.322 ms) : 0, 61322
Debugger [candidate] (62.675 ms) : 0, 62675
Remote Config [baseline] (524.037 µs) : 0, 524
Remote Config [candidate] (525.457 µs) : 0, 525
Telemetry [baseline] (15.065 ms) : 0, 15065
Telemetry [candidate] (14.95 ms) : 0, 14950
Flare Poller [baseline] (4.955 ms) : 0, 4955
Flare Poller [candidate] (4.844 ms) : 0, 4844

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	ygree/llmobs-systest-fixes
git_commit_date	1772749357	1772798039
git_commit_sha	`4fd66d4`	`0c879ba`
release_version	1.61.0-SNAPSHOT~4fd66d45a9	1.60.0-SNAPSHOT~0c879ba692

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1772800263	1772800263
ci_job_id	1482775563	1482775563
ci_pipeline_id	100871637	100871637
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-p5ihk4x4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-p5ihk4x4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 17 metrics, 17 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:insecure-bank:iast:high_load	worse [+109.703µs; +208.845µs] or [+4.588%; +8.734%]	unsure [+81.717µs; +447.536µs] or [+1.149%; +6.292%]	unstable [-210.525op/s; +70.463op/s] or [-14.289%; +4.782%]	2.550ms	7.377ms	1403.344op/s	2.391ms	7.112ms	1473.375op/s
scenario:load:petclinic:profiling:high_load	worse [+500.421µs; +1384.895µs] or [+2.739%; +7.580%]	unsure [+0.340ms; +1.675ms] or [+1.148%; +5.653%]	unstable [-34.453op/s; +13.578op/s] or [-13.724%; +5.409%]	19.213ms	30.640ms	240.594op/s	18.270ms	29.633ms	251.031op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.246 ms) : 18057, 18436
.   : milestone, 18246,
appsec (18.345 ms) : 18162, 18529
.   : milestone, 18345,
code_origins (17.934 ms) : 17755, 18112
.   : milestone, 17934,
iast (17.666 ms) : 17494, 17839
.   : milestone, 17666,
profiling (18.599 ms) : 18416, 18782
.   : milestone, 18599,
tracing (17.95 ms) : 17772, 18129
.   : milestone, 17950,
section candidate
no_agent (19.177 ms) : 18978, 19376
.   : milestone, 19177,
appsec (18.488 ms) : 18305, 18671
.   : milestone, 18488,
code_origins (17.632 ms) : 17459, 17805
.   : milestone, 17632,
iast (17.736 ms) : 17560, 17911
.   : milestone, 17736,
profiling (19.404 ms) : 19209, 19599
.   : milestone, 19404,
tracing (17.782 ms) : 17605, 17960
.   : milestone, 17782,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	18.246 ms [18.057 ms, 18.436 ms]	-
appsec	18.345 ms [18.162 ms, 18.529 ms]	99.213 µs (0.5%)
code_origins	17.934 ms [17.755 ms, 18.112 ms]	-312.562 µs (-1.7%)
iast	17.666 ms [17.494 ms, 17.839 ms]	-579.973 µs (-3.2%)
profiling	18.599 ms [18.416 ms, 18.782 ms]	352.543 µs (1.9%)
tracing	17.95 ms [17.772 ms, 18.129 ms]	-295.814 µs (-1.6%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	19.177 ms [18.978 ms, 19.376 ms]	-
appsec	18.488 ms [18.305 ms, 18.671 ms]	-689.093 µs (-3.6%)
code_origins	17.632 ms [17.459 ms, 17.805 ms]	-1.545 ms (-8.1%)
iast	17.736 ms [17.56 ms, 17.911 ms]	-1.441 ms (-7.5%)
profiling	19.404 ms [19.209 ms, 19.599 ms]	226.864 µs (1.2%)
tracing	17.782 ms [17.605 ms, 17.96 ms]	-1.395 ms (-7.3%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.174 ms) : 1163, 1185
.   : milestone, 1174,
iast (3.103 ms) : 3060, 3146
.   : milestone, 3103,
iast_FULL (5.751 ms) : 5694, 5808
.   : milestone, 5751,
iast_GLOBAL (3.651 ms) : 3594, 3708
.   : milestone, 3651,
profiling (1.97 ms) : 1953, 1986
.   : milestone, 1970,
tracing (1.768 ms) : 1754, 1783
.   : milestone, 1768,
section candidate
no_agent (1.211 ms) : 1199, 1224
.   : milestone, 1211,
iast (3.26 ms) : 3217, 3303
.   : milestone, 3260,
iast_FULL (5.884 ms) : 5825, 5943
.   : milestone, 5884,
iast_GLOBAL (3.646 ms) : 3583, 3708
.   : milestone, 3646,
profiling (1.923 ms) : 1906, 1940
.   : milestone, 1923,
tracing (1.807 ms) : 1792, 1823
.   : milestone, 1807,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.174 ms [1.163 ms, 1.185 ms]	-
iast	3.103 ms [3.06 ms, 3.146 ms]	1.929 ms (164.3%)
iast_FULL	5.751 ms [5.694 ms, 5.808 ms]	4.577 ms (389.9%)
iast_GLOBAL	3.651 ms [3.594 ms, 3.708 ms]	2.477 ms (211.0%)
profiling	1.97 ms [1.953 ms, 1.986 ms]	795.819 µs (67.8%)
tracing	1.768 ms [1.754 ms, 1.783 ms]	594.462 µs (50.6%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.211 ms [1.199 ms, 1.224 ms]	-
iast	3.26 ms [3.217 ms, 3.303 ms]	2.048 ms (169.1%)
iast_FULL	5.884 ms [5.825 ms, 5.943 ms]	4.672 ms (385.7%)
iast_GLOBAL	3.646 ms [3.583 ms, 3.708 ms]	2.434 ms (200.9%)
profiling	1.923 ms [1.906 ms, 1.94 ms]	711.748 µs (58.7%)
tracing	1.807 ms [1.792 ms, 1.823 ms]	595.856 µs (49.2%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	ygree/llmobs-systest-fixes
git_commit_date	1772749357	1772798039
git_commit_sha	`4fd66d4`	`0c879ba`
release_version	1.61.0-SNAPSHOT~4fd66d45a9	1.60.0-SNAPSHOT~0c879ba692

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1772800036	1772800036
ci_job_id	1482775564	1482775564
ci_pipeline_id	100871637	100871637
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-m4iopm60 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-m4iopm60 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvement and 0 performance regressions! Performance is the same for 11 metrics, 0 unstable metrics.

scenario	Δ mean execution_time	candidate mean execution_time	baseline mean execution_time
scenario:dacapo:tomcat:appsec	better [-1.384ms; -1.045ms] or [-37.060%; -27.979%]	2.520ms	3.734ms

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.469 ms) : 1457, 1480
.   : milestone, 1469,
appsec (3.734 ms) : 3518, 3950
.   : milestone, 3734,
iast (2.252 ms) : 2183, 2321
.   : milestone, 2252,
iast_GLOBAL (2.294 ms) : 2225, 2364
.   : milestone, 2294,
profiling (2.102 ms) : 2046, 2158
.   : milestone, 2102,
tracing (2.065 ms) : 2011, 2118
.   : milestone, 2065,
section candidate
no_agent (1.477 ms) : 1465, 1489
.   : milestone, 1477,
appsec (2.52 ms) : 2464, 2575
.   : milestone, 2520,
iast (2.242 ms) : 2173, 2311
.   : milestone, 2242,
iast_GLOBAL (2.299 ms) : 2229, 2368
.   : milestone, 2299,
profiling (2.091 ms) : 2034, 2147
.   : milestone, 2091,
tracing (2.068 ms) : 2014, 2122
.   : milestone, 2068,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.469 ms [1.457 ms, 1.48 ms]	-
appsec	3.734 ms [3.518 ms, 3.95 ms]	2.265 ms (154.2%)
iast	2.252 ms [2.183 ms, 2.321 ms]	783.272 µs (53.3%)
iast_GLOBAL	2.294 ms [2.225 ms, 2.364 ms]	825.502 µs (56.2%)
profiling	2.102 ms [2.046 ms, 2.158 ms]	633.266 µs (43.1%)
tracing	2.065 ms [2.011 ms, 2.118 ms]	596.039 µs (40.6%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.477 ms [1.465 ms, 1.489 ms]	-
appsec	2.52 ms [2.464 ms, 2.575 ms]	1.043 ms (70.6%)
iast	2.242 ms [2.173 ms, 2.311 ms]	764.552 µs (51.8%)
iast_GLOBAL	2.299 ms [2.229 ms, 2.368 ms]	821.694 µs (55.6%)
profiling	2.091 ms [2.034 ms, 2.147 ms]	613.526 µs (41.5%)
tracing	2.068 ms [2.014 ms, 2.122 ms]	590.88 µs (40.0%)

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~0c879ba692, baseline=1.61.0-SNAPSHOT~4fd66d45a9
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.802 s) : 14802000, 14802000
.   : milestone, 14802000,
appsec (14.834 s) : 14834000, 14834000
.   : milestone, 14834000,
iast (18.242 s) : 18242000, 18242000
.   : milestone, 18242000,
iast_GLOBAL (17.806 s) : 17806000, 17806000
.   : milestone, 17806000,
profiling (14.553 s) : 14553000, 14553000
.   : milestone, 14553000,
tracing (15.109 s) : 15109000, 15109000
.   : milestone, 15109000,
section candidate
no_agent (15.1 s) : 15100000, 15100000
.   : milestone, 15100000,
appsec (14.957 s) : 14957000, 14957000
.   : milestone, 14957000,
iast (18.315 s) : 18315000, 18315000
.   : milestone, 18315000,
iast_GLOBAL (17.847 s) : 17847000, 17847000
.   : milestone, 17847000,
profiling (15.222 s) : 15222000, 15222000
.   : milestone, 15222000,
tracing (15.239 s) : 15239000, 15239000
.   : milestone, 15239000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	14.802 s [14.802 s, 14.802 s]	-
appsec	14.834 s [14.834 s, 14.834 s]	32.0 ms (0.2%)
iast	18.242 s [18.242 s, 18.242 s]	3.44 s (23.2%)
iast_GLOBAL	17.806 s [17.806 s, 17.806 s]	3.004 s (20.3%)
profiling	14.553 s [14.553 s, 14.553 s]	-249.0 ms (-1.7%)
tracing	15.109 s [15.109 s, 15.109 s]	307.0 ms (2.1%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.1 s [15.1 s, 15.1 s]	-
appsec	14.957 s [14.957 s, 14.957 s]	-143.0 ms (-0.9%)
iast	18.315 s [18.315 s, 18.315 s]	3.215 s (21.3%)
iast_GLOBAL	17.847 s [17.847 s, 17.847 s]	2.747 s (18.2%)
profiling	15.222 s [15.222 s, 15.222 s]	122.0 ms (0.8%)
tracing	15.239 s [15.239 s, 15.239 s]	139.0 ms (0.9%)

… with variables + chat_template, longest-first overlap handling) and support map-based LLM input serialization (messages + prompt) in LLMObs mapper. Also filter empty instruction messages to match system-test expectations.
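
One plausible reading of "longest-first overlap handling" is sketched below: when a chat_template is reconstructed from rendered text and a variables map, values are substituted longest first so that a shorter value that is a substring of a longer one does not break the longer match. This is an assumption about the approach, not the PR's implementation; the class `ChatTemplateSketch`, the method `toTemplate`, and the `{{name}}` placeholder syntax are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of longest-first variable substitution.
public class ChatTemplateSketch {

    static String toTemplate(String rendered, Map<String, String> variables) {
        // Collect only usable entries (skip null or empty values).
        List<Map.Entry<String, String>> entries = new ArrayList<>();
        for (Map.Entry<String, String> e : variables.entrySet()) {
            if (e.getValue() != null && !e.getValue().isEmpty()) {
                entries.add(e);
            }
        }
        // Longest value first, so overlapping shorter values cannot split a longer one.
        entries.sort(Comparator
                .comparingInt((Map.Entry<String, String> e) -> e.getValue().length())
                .reversed());

        String result = rendered;
        for (Map.Entry<String, String> e : entries) {
            result = result.replace(e.getValue(), "{{" + e.getKey() + "}}");
        }
        return result;
    }
}
```

For example, with `city = "New York City"` and `state = "New York"`, the text `Hello New York City from New York` becomes `Hello {{city}} from {{state}}`; substituting the shorter value first would instead produce `Hello {{state}} City from {{state}}`. The sketch does not guard against a value that happens to match text inside an already inserted placeholder.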

ygree self-assigned this Feb 19, 2026

ygree added the `comp: mlobs` (ML Observability (LLMObs)) and `type: bug` (Bug report and fix) labels Feb 19, 2026

llmobs: set model tag even when llmobs disabled

cbd6226

ygree force-pushed the ygree/llmobs-systest-fixes branch from 5cd257e to cbd6226 February 24, 2026 09:31

ygree changed the title ~~llmobs: set model tag even when llmobs disabled~~ fix(llmobs): set model tag even when llmobs disabled Mar 2, 2026

ygree added 23 commits March 2, 2026 13:30

Set metadata.stream tag no matter it's true or false

4f27673

Set chat/completion CACHE_READ_INPUT_TOKENS tag

d128d6b

Set error and error_type tags

3fc5ceb

Use "" instead of null for the role in CompletionDecorator to comply …

021a9d1

…wthTestOpenAiLlmInteractions::test_completion

Use "" instead of null for the content to comply with TestOpenAiLlmIn…

0637931

…teractions::test_chat_completion_tool_call

Add missing metadata.tool_choice

0cb41e1

Add missing tool_definitions

a42f8aa

Add source:integration tag

6e10255

Add missing _dd attribute to the llmobs span event

34f3a07

Add missing error tags

a0c1139

Remove error from the llmobs span event. It must be part of meta block

effc343

Add missing meta.text.verbosity

c0e3876

Add summaryText and encrypted_content

b000770

Add missing tool_calls and tool_results for responses

53471a2

Always set stream param to produce the same request body to be aligne…

2207c46

…d with python openai instrumentation and system-tests

Fix OpenAI Responses prompt tracking to use response instructions fir…

7d683b6

…st and return [image] (not empty) when stripped input_image URLs are missing, aligning mixed-input chat_template output with expected behavior.

Set LLMObs error-path defaults in Java to always emit model_name and …

2c17ddc

…output.messages from request params so existing error-span tests pass.

Add OpenAI Responses tool definition extraction to populate LLMObs to…

ad3b782

…ol_definitions tags

Fix ChatCompletionServiceTest

1810327

Extract JsonValueUtils

46221e4

Refactor OpenAI responses instrumentation to reuse ToolCallExtractor …

61ad667

…JSON argument parsing and remove duplicate manual parsing logic from ResponseDecorator.

Fix test assertions

f0957b7

ygree added 5 commits March 6, 2026 10:35

Add integration tag

f3f1f75

Add ddtrace.version

668e955

Improve test assertions

d57402e

Merge branch 'master' into ygree/llmobs-systest-fixes

a3051e3

Fix format

0c879ba

ygree changed the title ~~fix(llmobs): set model tag even when llmobs disabled~~ fix(llmobs): openai-java payload mapping for responses, tool metadata, and prompt tracking Mar 6, 2026

ygree added the `tag: ai generated` (Largely based on code generated by an AI or LLM) and `tag: no release notes` (Changes to exclude from release notes) labels Mar 6, 2026

ygree marked this pull request as ready for review March 6, 2026 13:46

ygree requested review from a team as code owners March 6, 2026 13:46

fix(llmobs): openai-java payload mapping for responses, tool metadata, and prompt tracking #10644
ygree wants to merge 29 commits into master from ygree/llmobs-systest-fixes

1 participant
